People constantly use language to learn about the world. Computational linguists have capitalized on this fact to build large language models (LLMs) that acquire co-occurrence-based knowledge from language corpora. LLMs achieve impressive performance on many tasks, but the robustness of their world knowledge has been questioned. Here, we ask: do LLMs acquire generalized knowledge about real-world events? Using curated sets of minimal sentence pairs (n=1215), we tested whether LLMs are more likely to generate plausible event descriptions compared to their implausible counterparts. We found that LLMs systematically distinguish possible and impossible events (The teacher bought the laptop vs. The laptop bought the teacher) but fall short of human performance when distinguishing likely and unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level sentence features, (ii) LLMs generalize well across syntactic sentence variants (active vs passive) but less well across semantic sentence variants (synonymous sentences), (iii) some, but not all LLM deviations from ground-truth labels align with crowdsourced human judgments, and (iv) explicit event plausibility information emerges in middle LLM layers and remains high thereafter. Overall, our analyses reveal a gap in LLMs' event knowledge, highlighting their limitations as generalized knowledge bases. We conclude by speculating that the differential performance on impossible vs. unlikely events is not a temporary setback but an inherent property of LLMs, reflecting a fundamental difference between linguistic knowledge and world knowledge in intelligent systems.
translated by 谷歌翻译
The usage of technologically advanced devices has seen a boom in many domains, including education, automation, and healthcare; with most of the services requiring Internet connectivity. To secure a network, device identification plays key role. In this paper, a device fingerprinting (DFP) model, which is able to distinguish between Internet of Things (IoT) and non-IoT devices, as well as uniquely identify individual devices, has been proposed. Four statistical features have been extracted from the consecutive five device-originated packets, to generate individual device fingerprints. The method has been evaluated using the Random Forest (RF) classifier and different datasets. Experimental results have shown that the proposed method achieves up to 99.8% accuracy in distinguishing between IoT and non-IoT devices and over 97.6% in classifying individual devices. These signify that the proposed method is useful in assisting operators in making their networks more secure and robust to security breaches and unauthorized access.
translated by 谷歌翻译
Objective: Despite numerous studies proposed for audio restoration in the literature, most of them focus on an isolated restoration problem such as denoising or dereverberation, ignoring other artifacts. Moreover, assuming a noisy or reverberant environment with limited number of fixed signal-to-distortion ratio (SDR) levels is a common practice. However, real-world audio is often corrupted by a blend of artifacts such as reverberation, sensor noise, and background audio mixture with varying types, severities, and duration. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics to enhance the quality of restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Operational-GANs are used with generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets corrupted with a random blend of artifacts each with a random severity to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneer study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio whilst achieving an unprecedented level of performance for a wide SDR range and artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source codes and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository1.
translated by 谷歌翻译
Machine learning (ML) models can leak information about users, and differential privacy (DP) provides a rigorous way to bound that leakage under a given budget. This DP budget can be regarded as a new type of compute resource in workloads of multiple ML models training on user data. Once it is used, the DP budget is forever consumed. Therefore, it is crucial to allocate it most efficiently to train as many models as possible. This paper presents the scheduler for privacy that optimizes for efficiency. We formulate privacy scheduling as a new type of multidimensional knapsack problem, called privacy knapsack, which maximizes DP budget efficiency. We show that privacy knapsack is NP-hard, hence practical algorithms are necessarily approximate. We develop an approximation algorithm for privacy knapsack, DPK, and evaluate it on microbenchmarks and on a new, synthetic private-ML workload we developed from the Alibaba ML cluster trace. We show that DPK: (1) often approaches the efficiency-optimal schedule, (2) consistently schedules more tasks compared to a state-of-the-art privacy scheduling algorithm that focused on fairness (1.3-1.7x in Alibaba, 1.0-2.6x in microbenchmarks), but (3) sacrifices some level of fairness for efficiency. Therefore, using DPK, DP ML operators should be able to train more models on the same amount of user data while offering the same privacy guarantee to their users.
translated by 谷歌翻译
Automatic medical image classification is a very important field where the use of AI has the potential to have a real social impact. However, there are still many challenges that act as obstacles to making practically effective solutions. One of those is the fact that most of the medical imaging datasets have a class imbalance problem. This leads to the fact that existing AI techniques, particularly neural network-based deep-learning methodologies, often perform poorly in such scenarios. Thus this makes this area an interesting and active research focus for researchers. In this study, we propose a novel loss function to train neural network models to mitigate this critical issue in this important field. Through rigorous experiments on three independently collected datasets of three different medical imaging domains, we empirically show that our proposed loss function consistently performs well with an improvement between 2%-10% macro f1 when compared to the baseline models. We hope that our work will precipitate new research toward a more generalized approach to medical image classification.
translated by 谷歌翻译
Understanding why a model makes certain predictions is crucial when adapting it for real world decision making. LIME is a popular model-agnostic feature attribution method for the tasks of classification and regression. However, the task of learning to rank in information retrieval is more complex in comparison with either classification or regression. In this work, we extend LIME to propose Rank-LIME, a model-agnostic, local, post-hoc linear feature attribution method for the task of learning to rank that generates explanations for ranked lists. We employ novel correlation-based perturbations, differentiable ranking loss functions and introduce new metrics to evaluate ranking based additive feature attribution models. We compare Rank-LIME with a variety of competing systems, with models trained on the MS MARCO datasets and observe that Rank-LIME outperforms existing explanation algorithms in terms of Model Fidelity and Explain-NDCG. With this we propose one of the first algorithms to generate additive feature attributions for explaining ranked lists.
translated by 谷歌翻译
Handwriting Recognition has been a field of great interest in the Artificial Intelligence domain. Due to its broad use cases in real life, research has been conducted widely on it. Prominent work has been done in this field focusing mainly on Latin characters. However, the domain of Arabic handwritten character recognition is still relatively unexplored. The inherent cursive nature of the Arabic characters and variations in writing styles across individuals makes the task even more challenging. We identified some probable reasons behind this and proposed a lightweight Convolutional Neural Network-based architecture for recognizing Arabic characters and digits. The proposed pipeline consists of a total of 18 layers containing four layers each for convolution, pooling, batch normalization, dropout, and finally one Global average pooling and a Dense layer. Furthermore, we thoroughly investigated the different choices of hyperparameters such as the choice of the optimizer, kernel initializer, activation function, etc. Evaluating the proposed architecture on the publicly available 'Arabic Handwritten Character Dataset (AHCD)' and 'Modified Arabic handwritten digits Database (MadBase)' datasets, the proposed model respectively achieved an accuracy of 96.93% and 99.35% which is comparable to the state-of-the-art and makes it a suitable solution for real-life end-level applications.
translated by 谷歌翻译
Atrial Fibrillation (AF) is characterized by disorganised electrical activity in the atria and is known to be sustained by the presence of regions of fibrosis (scars) or functional cellular remodeling, both of which may lead to areas of slow conduction. Estimating the effective conductivity of the myocardium and identifying regions of abnormal propagation is therefore crucial for the effective treatment of AF. We hypothesise that the spatial distribution of tissue conductivity can be directly inferred from an array of concurrently acquired contact electrograms (EGMs). We generate a dataset of simulated cardiac AP propagation using randomised scar distributions and a phenomenological cardiac model and calculate contact electrograms at various positions on the field. A deep neural network, based on a modified U-net architecture, is trained to estimate the location of the scar and quantify conductivity of the tissue with a Jaccard index of $91$%. We adapt a wavelet-based surrogate testing analysis to confirm that the inferred conductivity distribution is an accurate representation of the ground truth input to the model. We find that the root mean square error (RMSE) between the ground truth and our predictions is significantly smaller ($p_{val}=0.007$) than the RMSE between the ground truth and surrogate samples.
translated by 谷歌翻译
Causal structure learning from observational data remains a non-trivial task due to various factors such as finite sampling, unobserved confounding factors, and measurement errors. Constraint-based and score-based methods tend to suffer from high computational complexity due to the combinatorial nature of estimating the directed acyclic graph (DAG). Motivated by the `Cause-Effect Pair' NIPS 2013 Workshop on Causality Challenge, in this paper, we take a different approach and generate a probability distribution over all possible graphs informed by the cause-effect pair features proposed in response to the workshop challenge. The goal of the paper is to propose new methods based on this probabilistic information and compare their performance with traditional and state-of-the-art approaches. Our experiments, on both synthetic and real datasets, show that our proposed methods not only have statistically similar or better performances than some traditional approaches but also are computationally faster.
translated by 谷歌翻译
We propose KnowGL, a tool that allows converting text into structured relational data represented as a set of ABox assertions compliant with the TBox of a given Knowledge Graph (KG), such as Wikidata. We address this problem as a sequence generation task by leveraging pre-trained sequence-to-sequence language models, e.g. BART. Given a sentence, we fine-tune such models to detect pairs of entity mentions and jointly generate a set of facts consisting of the full set of semantic annotations for a KG, such as entity labels, entity types, and their relationships. To showcase the capabilities of our tool, we build a web application consisting of a set of UI widgets that help users to navigate through the semantic data extracted from a given input text. We make the KnowGL model available at https://huggingface.co/ibm/knowgl-large.
translated by 谷歌翻译